A Novel Gene Signature to Predict Survival Time and Incident Ventricular Arrhythmias in Patients with Dilated Cardiomyopathy

Mortality in patients with nonischaemic dilated cardiomyopathy (NIDCM) remains high, and sudden death in NIDCM can be caused by ventricular tachycardia. It is therefore necessary to explore the pathogenesis of ventricular arrhythmias (VA) in NIDCM. Differentially expressed genes (DEGs) were identified by comparing gene expression between NIDCM patients with and without VA in the GSE135055 expression profile. A total of 228 DEGs were obtained, and 3 genes were found to be significantly related to the survival time of NIDCM patients. We established a prediction model for the survival time of NIDCM patients based on a two-gene signature (TOMM22 and PPP2R5A). The area under the curve (AUC) calculated by ROC curve analysis was 0.75. These risk genes are potential new targets for exploring the pathogenesis of NIDCM with VA, and the prediction model for survival time and incident ventricular arrhythmias may be useful in clinical decision making for individualized treatment.


Introduction
Nonischaemic dilated cardiomyopathy (NIDCM) is one of the most common inherited cardiomyopathies and is considered to be one of the main causes of heart failure and sudden cardiac death. Heart transplantation is usually needed [1,2]. One-third of patients with NIDCM may present with arrhythmogenic phenotypes and have an increased risk of arrhythmias during follow-up; ventricular arrhythmia (VA) is a major cause of clinical deterioration and death in patients with NIDCM [3][4][5][6]. To prevent these outcomes, current international guidelines recommend prophylactic implantation of an implantable cardioverter-defibrillator (ICD) in patients with heart failure and a left ventricular ejection fraction < 35% [7,8]. Although NIDCM is characterized by genetic and clinical heterogeneity, few studies have explored the pathogenesis of VA in patients with NIDCM. To reveal the underlying molecular mechanism, we divided NIDCM patients into a sinus rhythm (SR) group and a VA group and identified differentially expressed genes (DEGs) using bioinformatic methods.

Data Source.
We downloaded the microarray expression dataset (GSE135055) from the Gene Expression Omnibus (GEO) database [9]. The multilevel transcriptional data of GSE135055 were generated from the heart tissues of 21 heart failure (HF) patients and 9 healthy donors. Among these HF patients, there were 18 NIDCM patients, of whom 6 suffered VA, including ventricular premature beats, ventricular tachycardia, and ventricular fibrillation [10]. The characteristics of these patients are listed in Table 1 (supplemental file, Table S1). We divided the NIDCM patients into two groups, with 12 patients in the sinus rhythm (SR) group and 6 patients in the VA group. We used the "limma" package [11] to quantile-normalize the datasets and screened for DEGs between the SR group and the VA group; genes with p < 0.05 and |log2FC| > 1 were considered statistically significant. A volcano map and a heat map were used to show the DEGs.
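The screening thresholds described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which relied on the "limma" package; the function name `select_degs`, the record fields, and the example values are hypothetical and serve only to show how the two cutoffs (p < 0.05 and |log2FC| > 1) act together.

```python
def select_degs(rows, p_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with p < p_cutoff and |log2FC| > lfc_cutoff.

    Each row is a dict with the (hypothetical) keys
    "gene", "log2fc", and "p_value".
    """
    selected = []
    for row in rows:
        if row["p_value"] < p_cutoff and abs(row["log2fc"]) > lfc_cutoff:
            # Positive log2FC is labeled "up", negative is labeled "down".
            direction = "up" if row["log2fc"] > 0 else "down"
            selected.append({**row, "direction": direction})
    return selected


# Hypothetical values for illustration only; not taken from GSE135055.
example = [
    {"gene": "GENE_A", "log2fc": 1.8, "p_value": 0.003},
    {"gene": "GENE_B", "log2fc": -1.2, "p_value": 0.020},
    {"gene": "GENE_C", "log2fc": 0.4, "p_value": 0.001},   # fold change too small
    {"gene": "GENE_D", "log2fc": 2.5, "p_value": 0.300},   # not significant
]
degs = select_degs(example)
```

Both conditions must hold, so a gene with a very small p value but a modest fold change (GENE_C above) is excluded, as is a gene with a large fold change but a nonsignificant p value (GENE_D).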

Construction of Survival Time Prediction Model.
Seventeen NIDCM patients (including the 6 patients who suffered VA) with complete survival time information (the time period of each patient from symptoms to heart transplantation) were selected for the study. Using univariate Cox regression analysis, we screened for DEGs significantly related to survival time (p < 0.05) [12]. Then, a gene signature and a prediction model for survival time were constructed via multivariate Cox analysis, survival analysis, and a random survival forest algorithm.

Evaluation of Survival Time Prediction Model.
The risk score of each patient was calculated from the regression coefficient and the expression value of each selected gene obtained by the multivariate Cox regression model. We then separated the 17 patients into high-risk and low-risk groups using the median risk score as the cutoff. A high risk score indicates poor survival time for NIDCM patients. The accuracy of the prediction model was evaluated by time-dependent ROC analysis.
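As a minimal sketch of the scoring step, the risk score can be written as a coefficient-weighted sum of expression values, followed by a median split. The coefficients, expression values, and function names below are hypothetical placeholders, not the fitted values from this study, and the rule for a score exactly equal to the median (assigned here to the low-risk group) is an assumption.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum: coefficient of each gene times its expression value."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefs = {"TOMM22": 0.8, "PPP2R5A": 0.5}
patients = {
    "P1": {"TOMM22": 5.0, "PPP2R5A": 3.0},
    "P2": {"TOMM22": 6.5, "PPP2R5A": 4.0},
    "P3": {"TOMM22": 4.0, "PPP2R5A": 2.0},
}
scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

With an odd number of patients and distinct scores, a strict greater-than rule puts the median patient in the low-risk group. For 17 patients this gives 8 high-risk and 9 low-risk patients, which appears consistent with the reported survival percentages (for example, 37.5% = 3/8 and 55.6% = 5/9).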

Enrichment Analysis by Metascape.
The pathway enrichment of DEGs was analyzed using Metascape. With the species limited to "Homo sapiens," functional enrichment analysis was performed based on Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, Reactome pathways, and Canonical Pathways [13]. All genes in the genome were used as the enrichment background. Terms with a p value < 0.01, a minimum count of 3, and an enrichment factor > 1.5 were collected and grouped into clusters based on their membership similarities.

Table 2). The 3 genes were used to establish a model for predicting the survival time of NIDCM patients. Multivariate Cox proportional hazards regression analysis was performed to confirm the optimal model, and a forest plot is shown in Figure 2. PPP2R5A and TOMM22 were identified as high-risk genes in the survival time model, and they are negatively correlated with patient prognosis (Table 3). Patients were subdivided into high-risk and low-risk groups for survival time based on the median risk score. Survival analysis revealed that high risk scores were significantly related to poor survival time (symptoms to heart transplantation). The 2-year and 5-year survival rates for the high-risk patients were 37.5% and 25%, whereas the 8-year and 11-year survival rates for the low-risk patients were 55.6% and 44.4%. We then measured the predictive performance of the prognostic risk model using time-dependent receiver operating characteristic (ROC) curves. The area under the curve (AUC) was 0.75, which indicated reasonable predictive accuracy for survival time in NIDCM patients (Figure 3).
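To make the AUC concrete, the following simplified sketch computes a ROC AUC at a single time horizon. It assumes that every patient has an observed event time (no censoring), treats patients with an event at or before the horizon as cases and the rest as controls, and is not the time-dependent ROC estimator used in the study. The function name and all values are hypothetical.

```python
def auc_at_horizon(scores, times, horizon):
    """ROC AUC for separating patients with an event by `horizon` from the rest.

    Assumes no censoring: every entry in `times` is an observed event time.
    The AUC is the fraction of case-control pairs in which the case has the
    higher risk score, counting ties as one half.
    """
    cases = [s for s, t in zip(scores, times) if t <= horizon]
    controls = [s for s, t in zip(scores, times) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores and event times in years, for illustration only.
example_scores = [2.1, 0.5, 0.9, 1.2, 0.4, 0.6]
example_times = [1.0, 2.5, 6.0, 3.0, 9.0, 4.0]
auc = auc_at_horizon(example_scores, example_times, horizon=3.0)
```

An AUC of 0.5 corresponds to a score that ranks case-control pairs no better than chance, and 1.0 to a score that ranks every case above every control.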

Function Enrichment Analysis by Metascape.
The Metascape function enrichment results for DEGs included 204 Gene Ontology terms, 53 Reactome pathways, and 6 Kyoto Encyclopedia of Genes and Genomes pathways. The risk gene PPP2R5A is mainly involved in Influenza Infection, protein localization to membrane, PID IGF1 PATHWAY, CTLA4 inhibitory signaling, Platelet activation, signaling and aggregation, and Hemostasis. TOMM22 is mainly enriched in Influenza Infection, establishment of protein localization to organelle, establishment of protein localization to membrane, protein targeting, protein localization to membrane, mitochondrion organization, autophagy, process utilizing autophagic mechanism, macroautophagy, regulation of mRNA metabolic process, and mitochondrial membrane organization (Figure 4).
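The term-selection criteria stated in the methods (p value < 0.01, minimum count of 3, enrichment factor > 1.5) can be sketched as follows. The sketch assumes that the enrichment factor is the ratio of the observed count to the count expected by chance and that the p value comes from a one-sided hypergeometric test; it is an illustration, not Metascape's implementation. The function names, the term size, and the background size of 20,000 genes are hypothetical.

```python
from math import comb


def enrichment_factor(hits, list_size, term_size, background_size):
    """Observed hits divided by the number of hits expected by chance."""
    expected = list_size * term_size / background_size
    return hits / expected


def hypergeom_p(hits, list_size, term_size, background_size):
    """One-sided hypergeometric tail: P(X >= hits)."""
    total = comb(background_size, list_size)
    upper = min(list_size, term_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def passes_filters(hits, list_size, term_size, background_size):
    """Apply the three cutoffs: count >= 3, factor > 1.5, p < 0.01."""
    return (
        hits >= 3
        and enrichment_factor(hits, list_size, term_size, background_size) > 1.5
        and hypergeom_p(hits, list_size, term_size, background_size) < 0.01
    )


# Hypothetical term: 8 of the 228 DEGs fall in a 100-gene term,
# against an assumed background of 20,000 genes.
factor = enrichment_factor(8, 228, 100, 20000)
keep = passes_filters(8, 228, 100, 20000)
```

Under these assumed numbers, about 1.14 hits would be expected by chance, so 8 observed hits give an enrichment factor of roughly 7.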

Discussion
Because of its high incidence, poor therapeutic response, and poor prognosis, NIDCM is a major health problem, and it is necessary to explore its pathogenesis and establish an accurate prognostic prediction model [14][15][16]. NIDCM is characterized as a genetically determined disease; the genetic basis of NIDCM highlights the importance of screening biomarkers for diagnosis and prognosis [17][18][19]. Although the survival time of patients with NIDCM varies considerably [20,21], there are few methods to predict survival time after a diagnosis of NIDCM. We established a novel two-gene signature to predict survival in patients with NIDCM. To our knowledge, a prognostic model for NIDCM based on this two-gene signature has not been reported previously. The risk score was based on the mRNA expression of only two prognostic genes, rather than on somatic mutations or methylation status, which could make it more routine and cost-effective in practice.

Figure 1: DEGs between SR and VA. (a) Volcano map of DEGs between SR and VA. Red represents upregulated genes, black represents genes with no significant difference, and green represents downregulated genes. (b) Heat map of all DEGs between SR and VA. Each column represents a tissue sample, and each row represents a DEG. The gradual color change from green to red indicates the change from downregulation to upregulation.

VA is common in patients with NIDCM. Ventricular fibrillation and cardiac arrest are important causes of death in patients with NIDCM [22,23], yet few studies have focused solely on VA in NIDCM. Although sudden cardiac death rates have decreased, implantation of an ICD for primary prevention in NIDCM patients does not provide an overall survival benefit [24]. A prediction model for incident VA in patients with NIDCM could support personalized treatment to improve survival and could serve as a reference for deciding whether a patient needs an ICD. In this study, we established a novel two-gene signature (TOMM22 and PPP2R5A) to predict the occurrence of VA in patients. Consistent with our findings, some studies support PPP2R5A as a novel target for the treatment of arrhythmia. PPP2R5A (protein phosphatase 2 regulatory subunit B'alpha) encodes an alpha isoform of the regulatory subunit B56 subfamily. Myocytes with downregulated B56α are insensitive to isoproterenol-induced induction of the arrhythmogenic Na+ channel late component. Voltage-gated Na+ channel 1.5 is critical for normal cardiac excitability, and the PP2A-B56α complex has been shown to interact with this primary cardiac voltage-gated Na+ channel [25]. TOMM22 (translocase of outer mitochondrial membrane 22) is responsible for the recognition and translocation of synthesized mitochondrial precursor proteins; phosphorylation of TOMM22 is a critical switch for mitophagy [26,27].
Taken together with the functional enrichment results above, TOMM22 may regulate mitochondrial autophagy in NIDCM, which needs to be confirmed by further research. Tachycardia-induced cardiomyopathy (TIC) is characterized by diverse tachyarrhythmias, including supraventricular arrhythmias (such as atrial fibrillation) and ventricular arrhythmias [28,29]. PPP2R5A and TOMM22 show significantly different expression levels between the VA group and the SR group in NIDCM patients, which suggests that they may be related to the pathogenesis of TIC. Further study is needed.
However, there are some limitations in our research. We studied a small number of patients, although the AUC (0.75) indicated reasonable predictive accuracy of the prediction model for survival time in NIDCM patients. In future research, we will carry out large-sample, randomized follow-up studies. In addition, although the time period from symptoms to heart transplantation can represent the prognosis of patients with NIDCM to some extent, overall survival data would be more accurate, and we will carry out long-term follow-up of patients to collect more detailed clinical information.

Conclusions
Our study identified 3 new genes that were significantly related to the survival time of NIDCM patients and established a novel two-gene signature to predict survival time and incident VA in NIDCM. Although the sample size of our study is relatively small, these risk genes are potential new targets for exploring the pathogenesis of NIDCM with VA, and the prediction model may be useful in clinical decision making for individualized treatment.

Abbreviations
NIDCM: Nonischaemic dilated cardiomyopathy
VA: Ventricular arrhythmias
GEO: Gene Expression Omnibus
DEGs: Differentially expressed genes
SR: Sinus rhythm
ROCs: Receiver operating characteristics
AUC: Area under the curve
HF: Heart failure

Data Availability
The mRNA expression profiling dataset was obtained from GEO (GSE135055). The corresponding clinical information of the patients in this study was downloaded from BMC Medicine (doi:10.1186/s12916-019-1469-4).

Ethical Approval
The databases are publicly available and open access, and the present study followed the data access policy and publishing guidelines of these databases. Therefore, approval by a local ethics committee was not required for this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.